Methods and Cost Models for XPath Query Processing in Main Memory Databases

Author

  • Henning Rode
Abstract

Recent work on XPath evaluation has produced efficient relational index structures for maintaining and querying XML through a DBMS. Building on one such relational encoding, the XPath Accelerator, this thesis takes a closer look at its use in query processing. Basic XPath operations, such as axis steps and simple node tests, are the focus of the study. Appropriate database operations for their evaluation are introduced in the context of the main memory DBMS Monet. Where the existing database operators fail to exploit the tree properties of XML data, new algorithms have been developed specifically for the evaluation of XPath axes. As an important step towards cost analysis for the proposed XPath operations, result size estimation is discussed in terms of the trade-off between accuracy and expense. Different methods show how statistical data as well as sampling techniques can be used to estimate the result sizes of simple axis steps. The cost functions mainly consider the time that the XPath operations spend on data access. Even in main memory databases, CPU processing usually stalls on outstanding memory fetches. Therefore, our cost functions explicitly analyze the cache usage of the operations, adopting a hierarchical memory access model. Detailed tests demonstrate the accuracy and performance of the proposed result size and cost estimation techniques.
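
As a rough illustration of the kind of relational encoding the abstract refers to, the sketch below assigns each node its preorder and postorder rank and evaluates two axis steps as range predicates over those ranks. This is the general pre/post idea behind the XPath Accelerator, not the thesis's Monet operators; the names `Node`, `encode`, `descendants`, and `ancestors` and the sample document are hypothetical, and the full scan over `rows` stands in for the specialized algorithms the thesis develops.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory XML element: a tag plus child elements."""
    tag: str
    children: list = field(default_factory=list)


def encode(root):
    """Return one (pre, post, tag) row per node, in document (preorder) order."""
    rows = []
    post_counter = 0

    def visit(node):
        nonlocal post_counter
        pre = len(rows)          # preorder rank = position of first visit
        rows.append(None)        # reserve the slot until post is known
        for child in node.children:
            visit(child)
        rows[pre] = (pre, post_counter, node.tag)
        post_counter += 1        # postorder rank = position of last visit

    visit(root)
    return rows


def descendants(rows, context_pre):
    """descendant axis: nodes that start after and finish before the context node."""
    pre_v, post_v, _ = rows[context_pre]
    return [r for r in rows if r[0] > pre_v and r[1] < post_v]


def ancestors(rows, context_pre):
    """ancestor axis: nodes that start before and finish after the context node."""
    pre_v, post_v, _ = rows[context_pre]
    return [r for r in rows if r[0] < pre_v and r[1] > post_v]


# Sample document <a><b><c/><d/></b><e/></a> (an assumption for illustration).
doc = Node("a", [Node("b", [Node("c"), Node("d")]), Node("e")])
rows = encode(doc)
```

Because each axis reduces to a rectangular region in the pre/post plane, a naive evaluation like this one touches every row; exploiting tree properties to avoid that is the motivation the abstract gives for the new algorithms.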

Similar resources

Relational Databases Query Optimization using Hybrid Evolutionary Algorithm

Optimizing database queries is a hard research problem. Exhaustive search techniques such as dynamic programming are suitable for queries with a few relations, but as the number of relations in a query increases, they demand too much memory and processing to remain practical, so randomized and evolutionary methods have to be used instead. The use of evolutionary methods, beca...

Selecting the Most Suitable Query Language for Using Outer Joins to Extract Data in Datalog Mode in the Deductive Database System DES

Deductive database systems are designed around a logical data model. Unlike a relational database management system (RDBMS), which stores data in tables, a deductive database system stores data as facts. The Datalog Educational System (DES) is a deductive database system whose default mode is Datalog. It can extract data using outer joins with three query la...

Pushing XML Main Memory Databases to their Limits

The wide distribution of XML documents and the standardization of the query languages XPath and XQuery have led to a wide variety of XML database implementations. Yet the efficient processing of really large XML documents is still supported by just a few products, such as MonetDB/XQuery as an open-source solution [1] or X-Hive as a commercial product [2]. Following the main memory and relation...

FluXQuery: An Optimizing XQuery Processor for Streaming XML Data

XML has established itself as the ubiquitous format for data exchange on the Internet. An imminent development is that of streams of XML data being exchanged and queried. Data management scenarios where XQuery [11] is evaluated on XML streams are becoming increasingly important and realistic, e.g. in e-commerce settings. Naturally, query engines employed for stream processing are main-memory-ba...

A Dynamic Load-balancing Scheme for XPath Queries Parallelization in Shared Memory Multi-core Systems

With the rapid rise of multi-core processor systems, the parallelization of XPath queries on shared-memory multi-core systems has received growing attention. Existing work has developed parallelization methods based on cost estimation and static mapping, which can be seen as a logical optimization of the parallel query plan. However, static mapping may result in load imbalance that hurts th...

Publication date: 2003